SIMULTANEOUS APPROXIMATION BY GREEDY ALGORITHMS


D. Leviatan¹ and V.N. Temlyakov²

Abstract. We study nonlinear $m$-term approximation with regard to a redundant dictionary $\mathcal{D}$ in a Hilbert space $H$. It is known that the Pure Greedy Algorithm (or, more generally, the Weak Greedy Algorithm) provides for each $f \in H$ and any dictionary $\mathcal{D}$ an expansion into a series

$$f = \sum_{j=1}^{\infty} c_j(f)\varphi_j(f), \qquad \varphi_j(f) \in \mathcal{D}, \quad j = 1, 2, \dots$$

with the Parseval property: $\|f\|^2 = \sum_j |c_j(f)|^2$. Following the paper of A. Lutoborski and the second author [21] we study analogs of the above expansions for a given finite number of functions $f^1, \dots, f^N$ with a requirement that the dictionary elements $\varphi_j$ of these expansions are the same for all $f^i$, $i = 1, \dots, N$. We study convergence and rate of convergence of such expansions, which we call simultaneous expansions.


1.  Introduction 

In this paper we study nonlinear approximation. The basic idea behind nonlinear approximation is that the elements used in the approximation do not come from a fixed linear space but are allowed to depend on the function being approximated. The classical problem in this regard is the problem of $m$-term approximation, where one fixes a basis in the space and seeks to approximate a target function $f$ by a linear combination of $m$ terms from that basis. When the basis is a wavelet basis or a basis of other waveforms, this type of approximation is the starting point for compression algorithms. An important feature of approximation using a basis $\Psi := \{\psi_k\}_{k=1}^{\infty}$ of a Banach space $X$ is that each function $f \in X$ has a unique representation

$$f = \sum_{k=1}^{\infty} c_k(f)\psi_k \tag{1.1}$$

and we can identify $f$ with the set of its coefficients $\{c_k(f)\}_{k=1}^{\infty}$. The problem of $m$-term approximation with regard to a basis has been studied thoroughly and rather complete results have been established (see [2], [4]-[6], [9]-[11], [15], [19]-[23], [25]-[27], [31], [34]-[37], [42], [43]). In particular, it was established that the greedy type algorithm which forms a sum of $m$ terms with the largest $\|c_k(f)\psi_k\|_X$ out of expansion (1.1) in many cases almost realizes the best $m$-term approximation for function classes ([5]), and even for individual functions ([35], [23]).

¹ Part of this work was done while the first author visited the University of South Carolina in January 2003.

² This research was supported by the National Science Foundation Grant DMS 0200187 and by ONR Grant N00014-96-1-1003.

Recently,  there  has  emerged  another  more  complicated  form  of  nonlinear  approximation 
which  we  call  highly  nonlinear  approximation.  It  takes  many  forms  but  has  the  basic 
ingredient  that  the  basis  is  replaced  by  a  larger  system  of  functions  that  is  usually  redundant. 
We  call  such  systems  dictionaries.  Redundancy  on  the  one  hand  offers  much  promise  for 
greater  efficiency  in  terms  of  approximation  rate,  but  on  the  other  hand  gives  rise  to  highly 
nontrivial  theoretical  and  practical  problems.  Approximation  with  regard  to  a  redundant 
dictionary has been studied in [1], [3], [4], [7], [8], [12]-[14], [16]-[18], [24], [28]-[30], [32], [33], [38]-[42] and other papers. We refer the reader to surveys [4] and [42] for a discussion
of  approximation  results  for  redundant  dictionaries. 

We recall some notation and definitions from the theory of approximation with regard to redundant systems. Let $H$ be a real Hilbert space with an inner product $\langle\cdot,\cdot\rangle$ and the norm $\|x\| := \langle x, x\rangle^{1/2}$. We say a set $\mathcal{D}$ of functions (elements) from $H$ is a dictionary if each $g \in \mathcal{D}$ has norm one ($\|g\| = 1$) and $\overline{\operatorname{span}}\,\mathcal{D} = H$. In [7], the second author and DeVore studied the following greedy algorithm. If $f \in H$, one lets $g = g(f) \in \mathcal{D}$ be the element from $\mathcal{D}$ which maximizes $|\langle f, g\rangle|$ (of course, for this one makes the additional assumption that such a maximizer always exists), and defines

$$G(f) := G(f, \mathcal{D}) := \langle f, g\rangle g, \tag{1.2}$$

and

$$R(f) := R(f, \mathcal{D}) := f - G(f). \tag{1.3}$$

Pure Greedy Algorithm (PGA). Let $R_0(f) := R_0(f, \mathcal{D}) := f$ and $G_0(f) := 0$. Then, for each $m \ge 1$, we inductively define

$$G_m(f) := G_m(f, \mathcal{D}) := G_{m-1}(f) + G(R_{m-1}(f)),$$

$$R_m(f) := R_m(f, \mathcal{D}) := f - G_m(f) = R(R_{m-1}(f)).$$
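The PGA recursion can be illustrated by a minimal sketch, assuming a finite dictionary of unit vectors in $\mathbb{R}^2$ (so that the maximizer always exists). The function and variable names below are illustrative and are not taken from the paper.

```python
import math

def inner(u, v):
    """Euclidean inner product, standing in for <.,.> in H."""
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    n = math.sqrt(inner(v, v))
    return [a / n for a in v]

def pure_greedy(f, dictionary, steps):
    """Return (G_m(f), R_m(f)) after `steps` PGA iterations."""
    residual = list(f)          # R_0(f) = f
    approx = [0.0] * len(f)     # G_0(f) = 0
    for _ in range(steps):
        # g maximizes |<R_{m-1}(f), g>| over the finite dictionary
        g = max(dictionary, key=lambda d: abs(inner(residual, d)))
        c = inner(residual, g)
        approx = [a + c * x for a, x in zip(approx, g)]       # G_m = G_{m-1} + G(R_{m-1})
        residual = [r - c * x for r, x in zip(residual, g)]   # R_m = R(R_{m-1})
    return approx, residual

# Hypothetical redundant dictionary: three unit vectors spanning the plane.
D = [normalize(v) for v in ([1.0, 0.0], [0.0, 1.0], [1.0, 1.0])]
f = [1.0, 0.5]
G, R = pure_greedy(f, D, steps=5)
```

In this toy setting the residual norm cannot increase from one step to the next, and $f = G_m(f) + R_m(f)$ holds at every step by construction.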

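For a given dictionary $\mathcal{D}$ we can introduce a norm associated with $\mathcal{D}$ as

$$\|f\|_{\mathcal{D}} := \sup_{g \in \mathcal{D}} |\langle f, g\rangle|.$$

The Weak Greedy Algorithm (see [39]) is defined as follows. Let the sequence $\tau = \{t_k\}_{k=1}^{\infty}$, $0 \le t_k \le 1$, be given.

Weak Greedy Algorithm (WGA). Let $f_0^\tau := f$. Then for each $m \ge 1$, we inductively define:

1. Let $\varphi_m^\tau \in \mathcal{D}$ be any element satisfying

$$|\langle f_{m-1}^\tau, \varphi_m^\tau\rangle| \ge t_m \|f_{m-1}^\tau\|_{\mathcal{D}};$$

2.

$$f_m^\tau := f_{m-1}^\tau - \langle f_{m-1}^\tau, \varphi_m^\tau\rangle \varphi_m^\tau;$$

3.

$$G_m^\tau(f, \mathcal{D}) := \sum_{j=1}^{m} \langle f_{j-1}^\tau, \varphi_j^\tau\rangle \varphi_j^\tau.$$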

We note that in the particular case $t_k = t$, $k = 1, 2, \dots$, this algorithm was considered in [17]. Thus, the WGA is a generalization of the PGA in the direction of making it easier to construct an element $\varphi_m^\tau$ at the $m$-th greedy step. Note that the WGA includes, in addition to the first (greedy) step, a second step (see 2., 3. in the above definition) where we update the approximant by adding to it the orthogonal projection of the residual $f_{m-1}^\tau$ onto $\varphi_m^\tau$. Therefore, the WGA provides for each $f \in H$ an expansion into a series (a greedy expansion)

$$f \sim \sum_{j=1}^{\infty} c_j(f)\varphi_j^\tau, \qquad c_j(f) := \langle f_{j-1}^\tau, \varphi_j^\tau\rangle. \tag{1.4}$$

In general it is not an expansion into an orthogonal series, but it has some similar properties. The coefficients $c_j(f)$ of the expansion are obtained by the Fourier formulas with $f$ replaced by the residuals $f_{j-1}^\tau$. It is easy to see that

$$\|f_m^\tau\|^2 = \|f_{m-1}^\tau\|^2 - |c_m(f)|^2. \tag{1.5}$$

Therefore, for a convergent greedy expansion we get an analogue of the Parseval formula for orthogonal expansions:

$$\|f\|^2 = \sum_{j=1}^{\infty} |c_j(f)|^2.$$
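The identity (1.5) can be checked numerically. The following is a minimal sketch, assuming a finite dictionary in $\mathbb{R}^2$ and a constant weakness sequence $t_k = t$. Taking the first admissible element at each step is only one possible realization of the WGA, and all names are illustrative.

```python
import math

def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

def weak_greedy(f, dictionary, t, steps):
    """Run `steps` WGA iterations with constant weakness t.

    Returns the final residual f_m and the coefficients c_1(f), ..., c_m(f).
    """
    residual = list(f)
    coeffs = []
    for _ in range(steps):
        # ||f_{m-1}||_D = sup over the finite dictionary of |<f_{m-1}, g>|
        sup = max(abs(inner(residual, g)) for g in dictionary)
        # weak step: accept the FIRST element with |<f_{m-1}, g>| >= t * ||f_{m-1}||_D
        phi = next(g for g in dictionary if abs(inner(residual, g)) >= t * sup)
        c = inner(residual, phi)                               # c_m(f) = <f_{m-1}, phi_m>
        residual = [r - c * x for r, x in zip(residual, phi)]  # f_m
        coeffs.append(c)
    return residual, coeffs

s = 1.0 / math.sqrt(2.0)
D = [[s, s], [1.0, 0.0], [0.0, 1.0], [s, -s]]   # hypothetical dictionary of unit vectors
f = [0.8, -0.6]
res, c = weak_greedy(f, D, t=0.5, steps=40)

# Summing (1.5) over the steps: ||f||^2 = sum_j |c_j(f)|^2 + ||f_m||^2
energy_gap = inner(f, f) - sum(x * x for x in c) - inner(res, res)
```

Here `energy_gap` vanishes up to rounding, and in this particular example the residual tends to zero, so the partial sums of $|c_j(f)|^2$ approach $\|f\|^2$.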

The problem of convergence of the WGA is now settled in the following sense. In [40], a class $\mathcal{V}$ of sequences has been introduced such that the condition $\tau \notin \mathcal{V}$ is necessary and sufficient for the convergence of a Weak Greedy Algorithm with weakness sequence $\tau$ for each $f \in H$, and all Hilbert spaces $H$ and dictionaries $\mathcal{D}$ (see [40] for the history of this problem). For a general dictionary $\mathcal{D}$, we define the class of functions

$$\mathcal{A}_1^o(\mathcal{D}, M) := \Big\{ f \in H : f = \sum_{k \in \Lambda} c_k w_k,\ w_k \in \mathcal{D},\ \#\Lambda < \infty \text{ and } \sum_{k \in \Lambda} |c_k| \le M \Big\}$$

and we define $\mathcal{A}_1(\mathcal{D}, M)$ as the closure (in $H$) of $\mathcal{A}_1^o(\mathcal{D}, M)$. Furthermore, we define $\mathcal{A}_1(\mathcal{D})$ as the union of the classes $\mathcal{A}_1(\mathcal{D}, M)$ over all $M > 0$. For $f \in \mathcal{A}_1(\mathcal{D})$, we define the norm $|f|_{\mathcal{A}_1(\mathcal{D})}$ as the smallest $M$ such that $f \in \mathcal{A}_1(\mathcal{D}, M)$.

For $M = 1$ we denote $A_1(\mathcal{D}) := \mathcal{A}_1(\mathcal{D}, 1)$. The rate of convergence of the PGA and the WGA for elements from $A_1(\mathcal{D})$ has been studied in [7], [24], [39], [28], [41]. The following result has been obtained in [39].

Theorem 1.1. Let $\mathcal{D}$ be an arbitrary dictionary in $H$. Assume $\tau := \{t_k\}_{k=1}^{\infty}$ is a nonincreasing sequence. Then for $f \in A_1(\mathcal{D})$ we have

$$\|f - G_m^\tau(f, \mathcal{D})\| \le \Big(1 + \sum_{k=1}^{m} t_k^2\Big)^{-t_m/(2(2+t_m))}. \tag{1.6}$$

While Theorem 1.1 is valid for nonincreasing weakness sequences, we obtain in Section 2 an upper estimate for the rate of convergence of the WGA for a class of weakness sequences which includes nonmonotone sequences.

Theorem 1.2. Assume a weakness sequence $\tau = \{t_k\}_{k=1}^{\infty}$ has the property that there are a natural number $n$ and a real number $0 < t \le 1$ such that the inequality

$$n^{-1} \sum_{k=ln+1}^{(l+1)n} t_k^2 \ge t^2 \tag{1.7}$$

holds for all $l = 0, 1, 2, \dots$. Then if $f \in A_1(\mathcal{D})$, for any $0 < \delta < 1$ we have

$$\|f_{ln}^\tau\|^2 \le (3n/\delta^2)^{\frac{a}{2+a}} (1 + lt^2)^{-\frac{a}{2+a}}$$

with $a := t(1 - \delta)$.

We  also  prove  in  Section  2  that  Theorem  1.2  is  sharp  in  a  certain  sense. 

The main purpose of this paper is to construct greedy type (1.4) expansions for a given finite set of elements $f^1, \dots, f^N$, simultaneously with the same sequence $\{\varphi_j\}$ for all $f^i$, $i = 1, \dots, N$. The first result in this direction has recently been obtained in [30]. The Vector Greedy Algorithms that are designed for the purpose of constructing $m$th greedy approximants, simultaneously for a given finite number of elements, have been introduced and studied in [30]. Namely,

Vector Weak Greedy Algorithm (VWGA). Let a vector of elements $f^i \in H$, $i = 1, \dots, N$, be given. We write $f_0^{i,v,\tau} := f^i$. Then for each $m \ge 1$, we inductively define:

1. Let $\varphi_m^{v,\tau} \in \mathcal{D}$ be any element satisfying

$$\max_i |\langle f_{m-1}^{i,v,\tau}, \varphi_m^{v,\tau}\rangle| \ge t_m \max_i \|f_{m-1}^{i,v,\tau}\|_{\mathcal{D}}; \tag{1.8}$$

2.

$$f_m^{i,v,\tau} := f_{m-1}^{i,v,\tau} - \langle f_{m-1}^{i,v,\tau}, \varphi_m^{v,\tau}\rangle \varphi_m^{v,\tau}, \quad i = 1, \dots, N;$$

3.

$$G_m^{v,\tau}(f^i, \mathcal{D}) := \sum_{j=1}^{m} \langle f_{j-1}^{i,v,\tau}, \varphi_j^{v,\tau}\rangle \varphi_j^{v,\tau}, \quad i = 1, \dots, N.$$
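The three steps of the VWGA can be sketched as follows, assuming a finite dictionary in $\mathbb{R}^3$ and a constant weakness parameter $t$. The dictionary, the input vectors and the rule of accepting the first admissible element are illustrative assumptions and not part of the algorithm's definition.

```python
import math

def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

def vwga(fs, dictionary, t, steps):
    """Vector weak greedy sketch: one common element phi_m for all components."""
    residuals = [list(f) for f in fs]          # f_0^i = f^i
    chosen = []
    for _ in range(steps):
        # right-hand side of the selection rule: max_i ||f_{m-1}^i||_D
        sup = max(abs(inner(r, g)) for r in residuals for g in dictionary)
        # step 1: first g with max_i |<f_{m-1}^i, g>| >= t * max_i ||f_{m-1}^i||_D
        phi = next(g for g in dictionary
                   if max(abs(inner(r, g)) for r in residuals) >= t * sup)
        # step 2: update every component with the SAME phi
        coeffs = [inner(r, phi) for r in residuals]
        residuals = [[x - c * y for x, y in zip(r, phi)]
                     for r, c in zip(residuals, coeffs)]
        chosen.append(phi)
    return residuals, chosen

def total_energy(vectors):
    """Sum over i of ||f^i||^2."""
    return sum(inner(v, v) for v in vectors)

s = 1.0 / math.sqrt(2.0)
D = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [s, s, 0.0], [0.0, s, s]]
fs = [[1.0, 0.5, 0.0], [0.0, 0.5, 1.0]]
res5, _ = vwga(fs, D, t=0.7, steps=5)
res20, phis = vwga(fs, D, t=0.7, steps=20)
```

Because every component is projected away from the same unit vector, the quantity $\sum_i \|f_m^i\|^2$ is nonincreasing in $m$, which is the quantity that the rate estimates for the VWGA control.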


It was proved in [30] that under certain conditions on $\tau$ the VWGA converges. Therefore the VWGA provides the convergent expansions

$$f^i = \sum_{j=1}^{\infty} \langle f_{j-1}^{i,v,\tau}, \varphi_j^{v,\tau}\rangle \varphi_j^{v,\tau}, \quad i = 1, \dots, N,$$

with the property

$$\|f^i\|^2 = \sum_{j=1}^{\infty} |\langle f_{j-1}^{i,v,\tau}, \varphi_j^{v,\tau}\rangle|^2, \quad i = 1, \dots, N.$$

The following estimate of the rate of convergence of the VWGA has been obtained in [30].

Theorem 1.3. Let $\mathcal{D}$ be an arbitrary dictionary in $H$. Assume $\tau := \{t_k\}_{k=1}^{\infty}$, $t_k = t$, $k = 1, 2, \dots$, $0 < t \le 1$. Then for any vector of elements $f^1, \dots, f^N$, $f^i \in A_1(\mathcal{D})$, $i = 1, \dots, N$, we have

$$\sum_{i=1}^{N} \|f_m^{i,v,\tau}\|^2 \le (N + mt^2)^{-\frac{t}{2N+t}} N^{\frac{2N+3t}{2N+t}}.$$

We will improve this estimate in Section 3, proving

Theorem 1.4. Let $\mathcal{D}$ be an arbitrary dictionary in $H$. Assume $\tau := \{t_k\}_{k=1}^{\infty}$, $t_k = t$, $k \ge 1$, $0 < t \le 1$. Then for any vector of elements $f^1, \dots, f^N$, $f^i \in A_1(\mathcal{D})$, $i = 1, \dots, N$, we have

$$\sum_{i=1}^{N} \|f_m^{i,v,\tau}\|^2 \le N^2 (1 + mt^2/N)^{-\frac{t}{2N^{1/2}+t}}.$$

In addition to the VWGA we will consider in Section 3 two modifications of the VWGA. The modifications differ from the VWGA only in the first step. We modify this step in the following two ways. In the first step of the Simultaneous Weak Greedy Algorithm 1 (SWGA1)

1.(SWGA1) We look for any $\varphi_m^{s1,\tau} \in \mathcal{D}$ satisfying

$$\Big(\sum_{i=1}^{N} |\langle f_{m-1}^{i}, \varphi_m^{s1,\tau}\rangle|^2\Big)^{1/2} \ge t_m \max_i \|f_{m-1}^{i}\|_{\mathcal{D}}, \qquad f_{m-1}^{i} := f_{m-1}^{i,s1,\tau}. \tag{1.9}$$

In the first step of the Simultaneous Weak Greedy Algorithm 2 (SWGA2)

1.(SWGA2) We look for any $\varphi_m^{s2,\tau} \in \mathcal{D}$ satisfying

$$\Big(\sum_{i=1}^{N} |\langle f_{m-1}^{i}, \varphi_m^{s2,\tau}\rangle|^2\Big)^{1/2} \ge t_m \sup_{g \in \mathcal{D}} \Big(\sum_{i=1}^{N} |\langle f_{m-1}^{i}, g\rangle|^2\Big)^{1/2}, \qquad f_{m-1}^{i} := f_{m-1}^{i,s2,\tau}. \tag{1.10}$$

Clearly, any element $\varphi_m$ satisfying (1.8) or (1.10) also satisfies (1.9). Thus, any upper estimate for the SWGA1 yields an upper estimate for both the VWGA and the SWGA2. We prove in Section 3 an extension of Theorem 1.4 which holds for both variants of the Simultaneous Weak Greedy Algorithm (see Theorem 3.1).

2. Rate of convergence of WGA

The following lemma is due to [39].

Lemma 2.1. Let $\{a_m\}_{m=0}^{\infty}$ be a sequence of nonnegative numbers satisfying the inequalities

$$a_0 \le A, \qquad a_m \le a_{m-1}(1 - t_m^2 a_{m-1}/A), \quad m = 1, 2, \dots,$$

with $0 \le t_k \le 1$, $k = 1, 2, \dots$. Then for each $m$ we have

$$a_m \le A\Big(1 + \sum_{k=1}^{m} t_k^2\Big)^{-1}.$$

We  need  the  following  modification  of  this  lemma. 

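Lemma 2.2. Let $A \ge 2$ and $0 \le \beta_n \le 1$, $n = 1, 2, \dots$. Suppose $1 \ge x_0 \ge x_1 \ge \dots \ge 0$ satisfy the recurrent inequalities

$$x_n \le x_{n-1} - \frac{\beta_n}{A} x_n^2. \tag{2.1}$$

Then we have

$$x_m \le \frac{3}{2} A \Big(1 + \sum_{n=1}^{m} \beta_n\Big)^{-1}, \quad m = 1, 2, \dots. \tag{2.2}$$

Proof. We will use the following simple inequality

$$(1 + x)^{-1} \le 1 - \frac{2}{3}x, \quad 0 \le x \le 1/2. \tag{2.3}$$

We rewrite (2.1) in the form

$$x_n\Big(1 + \frac{\beta_n}{A} x_n\Big) \le x_{n-1}. \tag{2.4}$$

Clearly $x_{n-1} = 0$ implies $x_n = 0$. Thus it suffices to prove (2.2) for nonzero $x_m$. Using (2.3) we get from (2.4)

$$x_{n-1}^{-1} \le x_n^{-1}\Big(1 + \frac{\beta_n}{A} x_n\Big)^{-1} \le x_n^{-1} - \frac{2}{3}\frac{\beta_n}{A},$$

or

$$x_n^{-1} \ge x_{n-1}^{-1} + \frac{2}{3}\frac{\beta_n}{A}.$$

This implies

$$x_m^{-1} \ge x_0^{-1} + \frac{2}{3A}\sum_{n=1}^{m} \beta_n \ge 1 + \frac{2}{3A}\sum_{n=1}^{m} \beta_n \ge \frac{2}{3A}\Big(1 + \sum_{n=1}^{m} \beta_n\Big).$$

Finally

$$x_m \le \frac{3}{2} A \Big(1 + \sum_{n=1}^{m} \beta_n\Big)^{-1}. \qquad \Box$$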
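As a sanity check of Lemma 2.2, one can generate the sequence that satisfies (2.1) with equality and compare it with the bound (2.2). The sketch below does this for $A = 2$, $x_0 = 1$ and an arbitrarily chosen sequence $\beta_n \in \{1, 1/2, 1/3\}$. These values are assumptions made for illustration only.

```python
import math

def extremal_sequence(x0, betas, A):
    """Build x_0, x_1, ... with equality in (2.1): x_n + (beta_n/A) x_n^2 = x_{n-1}."""
    xs = [x0]
    for beta in betas:
        q = beta / A
        # nonnegative root of q*x^2 + x - x_prev = 0
        x = (-1.0 + math.sqrt(1.0 + 4.0 * q * xs[-1])) / (2.0 * q)
        xs.append(x)
    return xs

A = 2.0
betas = [1.0 / (1 + (k % 3)) for k in range(200)]   # 1, 1/2, 1/3, 1, 1/2, ...
xs = extremal_sequence(1.0, betas, A)

partial = 0.0
bound_holds = True
for m, beta in enumerate(betas, start=1):
    partial += beta
    # (2.2): x_m <= (3/2) A (1 + sum_{n<=m} beta_n)^{-1}
    bound_holds = bound_holds and xs[m] <= 1.5 * A / (1.0 + partial)
```

The flag `bound_holds` stays true for every $m$ in this run, in agreement with (2.2).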


We are ready to prove Theorem 1.2.

Proof of Theorem 1.2. Denote

$$a_m := \|f_m^\tau\|^2, \qquad y_m := |\langle f_{m-1}^\tau, \varphi_m^\tau\rangle|, \quad m = 1, 2, \dots, \qquad y_0 := 0.$$

Recalling (1.5), which can be rewritten as

$$a_m = a_{m-1} - y_m^2, \tag{2.5}$$

we conclude that $y_m \le 1$, $m \ge 0$. Let the sequence $\{b_m\}$ be defined by

$$b_0 := n/\delta, \qquad b_m := b_{m-1} + y_m, \quad m = 1, 2, \dots. \tag{2.6}$$

Then, evidently, $f_m^\tau \in \mathcal{A}_1(\mathcal{D}, b_m)$. By Lemma 3.5 of [7], we get

$$\sup_{g \in \mathcal{D}} |\langle f_{m-1}^\tau, g\rangle| \ge \|f_{m-1}^\tau\|^2 / b_{m-1},$$

which in turn implies (by the definition of $\varphi_m^\tau$)

$$y_m \ge t_m a_{m-1}/b_{m-1}. \tag{2.7}$$

Denote

$$x_l := a_{ln}, \qquad z_l := \Big(\sum_{k=ln+1}^{(l+1)n} y_k^2\Big)^{1/2} \le n^{1/2}, \qquad \text{and} \qquad w_l := n^{-1/2} b_{ln}.$$

Then (2.5) and (2.6) imply

$$x_{l+1} = x_l - z_l^2, \tag{E1}$$

$$w_{l+1} \le w_l + z_l, \tag{E2}$$

and (2.7) together with (1.7) and the fact that $\{x_l\}$ is decreasing and $\{w_l\}$ is increasing yields

$$z_l \ge t\,\frac{x_{l+1}}{w_{l+1}}. \tag{E3}$$

Now, combining (E1) and (E3) it follows that

$$x_{l+1} \le x_l - t^2\Big(\frac{x_{l+1}}{w_{l+1}}\Big)^2,$$

or

$$x_{l+1}\Big(1 + t^2 \frac{x_{l+1}}{w_{l+1}^2}\Big) \le x_l.$$

Again by the monotonicity of $\{w_l\}$ we obtain

$$\frac{x_{l+1}}{w_{l+1}^2}\Big(1 + t^2 \frac{x_{l+1}}{w_{l+1}^2}\Big) \le \frac{x_l}{w_l^2}.$$

Hence, by Lemma 2.2 with $A = 2$, $\beta_n = t^2$, $n = 1, 2, \dots$, we have

$$\frac{x_l}{w_l^2} \le 3(1 + lt^2)^{-1}. \tag{2.8}$$

Also, (E1) and (E3) imply

$$x_{l+1} \le x_l - z_l t \frac{x_{l+1}}{w_{l+1}},$$

or

$$x_{l+1}\Big(1 + \frac{t z_l}{w_{l+1}}\Big) \le x_l. \tag{2.9}$$

At the same time (E2) implies

$$w_{l+1} \le w_l(1 + z_l/w_l). \tag{2.10}$$

Thus, combining (2.9) and (2.10) we conclude that

$$x_{l+1}\Big(1 + t\,\frac{z_l/w_l}{1 + z_l/w_l}\Big) \le x_l. \tag{2.11}$$

Since $z_l \le n^{1/2}$ and $w_l \ge w_0 := n^{1/2}/\delta$, it follows that $z_l/w_l \le \delta$ for all $l$. For $a := t(1 - \delta)$ we apply (2.10) and the inequality

$$(1 + x)^a \le 1 + ax \le 1 + t\,\frac{x}{1 + x}, \quad 0 \le x \le \delta,$$

to obtain

$$x_{l+1} w_{l+1}^a \le x_{l+1} w_l^a (1 + z_l/w_l)^a \le x_{l+1}\Big(1 + t\,\frac{z_l/w_l}{1 + z_l/w_l}\Big) w_l^a \le x_l w_l^a \le \dots \le x_0 w_0^a \le \Big(\frac{n^{1/2}}{\delta}\Big)^a,$$

where in the third inequality we applied (2.11). Hence, by (2.8) we obtain

$$x_l^{2+a} \le 3^a (1 + lt^2)^{-a} x_l^2 w_l^{2a} \le (3n/\delta^2)^a (1 + lt^2)^{-a},$$

and

$$x_l \le (3n/\delta^2)^{\frac{a}{2+a}} (1 + lt^2)^{-\frac{a}{2+a}}.$$

This completes the proof of Theorem 1.2. $\Box$

An immediate consequence of Theorem 1.2 is

Corollary 2.1. Let $n \ge 2$ and $1 \le i \le n$ be given, and set

$$t_k := \begin{cases} 1, & k = ln + i,\ l = 0, 1, 2, \dots, \\ 0 & \text{otherwise}. \end{cases} \tag{2.12}$$

Then if $f \in A_1(\mathcal{D})$, we have the upper estimate for the error of the WGA

$$\|f_{ln}^\tau\|^2 \le (3n/\delta^2)^{\frac{a}{2+a}} (1 + ln^{-1})^{-\frac{a}{2+a}} = (3n^2/\delta^2)^{\frac{a}{2+a}} (n + l)^{-\frac{a}{2+a}}, \quad 0 < \delta < 1, \tag{2.13}$$

with $a = (1 - \delta)n^{-1/2}$.

Thus, we see that the exponent in (2.13) decreases with $n$ at the rate $n^{-1/2}$. We will show that for the particular case of a weakness sequence of the form (2.12) the dependence on $n$ of the exponent $\gamma$ in

$$\|f_{ln}^\tau\|^2 \le C(n)(1 + l)^{-\gamma}$$

is indeed of order $\le Cn^{-1/2}$.

To this end we use the construction of $\mathcal{D}_t$ from Section 2 of [29]. We begin with the Equalizer procedure. Namely, let $H$ be a Hilbert space with an orthonormal basis $\{e_j\}_{j=1}^{\infty}$. For two elements $e_i$, $e_j$, $i \ne j$, and for a positive number $t \le 1/3$ the following procedure is called "equalizer" and is denoted $E(e_i, e_j, t)$.

Equalizer $E(e_i, e_j, t)$. Set $f_0 := e_i$ and $g_1 := \alpha_1 e_i - (1 - \alpha_1^2)^{1/2} e_j$ with $\alpha_1 := t$. Clearly, $\|g_1\| = 1$ and $\langle f_0, g_1\rangle = t$. We define inductively the sequences $f_1, \dots, f_N$; $g_2, \dots, g_N$; and $\alpha_2 > 0, \dots, \alpha_N > 0$, with $N$ determined by the process. Let

$$f_n := f_{n-1} - \langle f_{n-1}, g_n\rangle g_n, \qquad \text{and} \qquad g_{n+1} := \alpha_{n+1} e_i - (1 - \alpha_{n+1}^2)^{1/2} e_j,$$

where $\alpha_{n+1} > 0$ satisfies

$$\langle f_n, g_{n+1}\rangle = t, \quad n = 1, 2, \dots.$$

Note that

$$\|f_n\|^2 = \|f_{n-1}\|^2 - t^2, \tag{2.14}$$

so that we can solve for $\alpha_{n+1} > 0$ as long as $N < [t^{-2}]$. Writing $f_n =: a_n e_i + b_n e_j$, it follows that

$$a_n = a_{n-1} - t\alpha_n, \qquad b_n = b_{n-1} + t(1 - \alpha_n^2)^{1/2}, \quad n \ge 2,$$

$$a_n - b_n = a_{n-1} - b_{n-1} - t\big(\alpha_n + (1 - \alpha_n^2)^{1/2}\big), \quad n \ge 2, \tag{2.15}$$

so that, in particular, $a_n - b_n$ is decreasing. Also by virtue of the inequality

$$1 \le x + (1 - x^2)^{1/2} \le 2^{1/2}, \quad 0 \le x \le 1,$$

we see that

$$a_{n-1} - b_{n-1} \le a_n - b_n + \sqrt{2}\,t. \tag{2.16}$$

We proceed this way as long as

$$a_n - b_n \ge \sqrt{2}\,t,$$

arriving at $N = N_t$ such that

$$a_{N-1} - b_{N-1} \ge \sqrt{2}\,t \qquad \text{and} \qquad a_N - b_N < \sqrt{2}\,t.$$

Note that by (2.15) and (2.16),

$$\frac{1}{\sqrt{2}\,t} - 1 < N_t < \frac{1}{t}. \tag{2.17}$$

At this stage we modify the $N$th step as follows. We take $g_N := 2^{-1/2}(e_i - e_j)$ and define

$$f_N := f_{N-1} - \langle f_{N-1}, g_N\rangle g_N.$$

It is clear that $a_N = b_N$, and by virtue of (2.16),

$$t \le \langle f_{N-1}, g_N\rangle \le 2t. \tag{2.18}$$

It follows from (2.14) and (2.17) that

$$\|f_{N-1}\|^2 \ge 1 - t + t^2,$$

and, in turn, by (2.18), we have

$$\|f_N\|^2 \ge \|f_{N-1}\|^2 - 4t^2 \ge \|f_0\|^2 - t - 3t^2.$$

Evidently, $E(e_i, e_j, t)$ is a WGA with respect to the dictionary $\mathcal{D}(i, j) := \{e_i, g_1, g_2, \dots, g_N\}$, with the "weakness" parameter $t$. It is worthwhile to note that the values $\{\alpha_k\}$, $\{a_k\}$ and $\{b_k\}$, $k = 1, \dots, N$, and the stopping stage $N$, depend only on $t$, and are independent of the choice of $e_i$ and $e_j$. Also, $N_t$ increases as $t$ decreases: it is constant for a while and then jumps up by 1. Thus, we take $\mu \ge 3$ and $t = t_\mu < 2^{-\mu}$ such that $N_t = 2^\mu$. This can be done since by virtue of (2.17), if $t = 2^{-\mu}$ then $N_t < 2^\mu$, and if $t = 2^{-\mu-1}$, then $N_t > 2^{\mu+1/2} - 1 > 2^\mu$.
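The equalizer can be simulated in the two coordinates $(a_n, b_n)$ of $f_n = a_n e_i + b_n e_j$. The sketch below is an illustration only. Solving $\langle f_n, g_{n+1}\rangle = t$ through the polar angle of $(a_n, b_n)$ is a device chosen here for the computation and is not taken from the paper, and the function name is hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)

def equalizer(t):
    """Simulate E(e_i, e_j, t) in the coordinates (a, b) of f_n = a e_i + b e_j.

    Returns (N, a_N, b_N): the stopping stage and the coordinates of f_N.
    """
    a, b = 1.0, 0.0                      # f_0 = e_i
    n = 0
    while True:
        # With a = r cos(theta), b = r sin(theta) and alpha = cos(phi),
        # <f, alpha e_i - sqrt(1 - alpha^2) e_j> = r cos(theta + phi) = t.
        r = math.hypot(a, b)
        phi = math.acos(t / r) - math.atan2(b, a)
        a_next = a - t * math.cos(phi)   # a_n = a_{n-1} - t * alpha_n
        b_next = b + t * math.sin(phi)   # b_n = b_{n-1} + t * sqrt(1 - alpha_n^2)
        n += 1
        if a_next - b_next < SQRT2 * t:
            # modified N-th step: use g_N = (e_i - e_j) / sqrt(2) instead
            c = (a - b) / SQRT2          # <f_{N-1}, g_N>
            return n, a - c / SQRT2, b + c / SQRT2
        a, b = a_next, b_next

results = {t: equalizer(t) for t in (0.125, 0.0625, 0.01)}
```

For each of these values of $t$ the stopping stage lies strictly between $(\sqrt{2}\,t)^{-1} - 1$ and $t^{-1}$, the two coordinates of $f_N$ coincide up to rounding, and $\|f_N\|^2 \ge 1 - t - 3t^2$, in line with (2.17) and the estimates above.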

We define a WGA with respect to the dictionary $\mathcal{D}_t := \cup_{(i,j) \in S} \mathcal{D}(i, j)$, where $S$ is determined by the equalizer procedures $\{E(e_i, e_j, t_\mu)\}$ defined above that will be used in the construction that follows. We begin with $f := e_1$ and apply $E(e_1, e_2, t)$, $t := t_\mu$. After $N_t = 2^\mu$ steps we obtain $g_1^0, \dots, g_{N_t}^0$, and

$$f^1 := c_1(e_1 + e_2), \qquad h := 2c_1^2,$$

with the property

$$\|f^1\|^2 = h, \qquad h \ge 1 - t - 3t^2.$$

We now obtain $g_1^1, \dots, g_{2N_t}^1$ by applying the equalizers $E(e_1, e_3, t)$ and $E(e_2, e_4, t)$. Thus after $2N_t$ additional steps of the WGA, we have

$$f^2 := c_2(e_1 + \dots + e_4), \qquad c_2 = c_1^2,$$

with the property

$$\|f^2\|^2 = 4c_2^2 = h^2.$$

After $\mu$ iterations we have made

$$N_t \sum_{k=0}^{\mu-1} 2^k = 2^\mu(2^\mu - 1) =: n - 1$$

steps, and obtained

$$f^\mu := c_\mu(e_1 + \dots + e_{2^\mu}), \qquad c_\mu = c_1^\mu.$$

At the $n$th step ($n = 2^{2\mu} - 2^\mu + 1$), we remove $c_\mu e_{2^\mu}$ by the PGA step

$$f_n := f^\mu - \langle f^\mu, e_{2^\mu}\rangle e_{2^\mu} = c_n(e_1 + \dots + e_{2^\mu - 1}), \qquad c_n^2 = h^\mu 2^{-\mu}.$$

Indeed,

$$\sup_{g \in \mathcal{D}} \langle f^\mu, g\rangle = c_\mu = \langle f^\mu, e_{2^\mu}\rangle.$$

We proceed as follows to obtain $f^{\mu+1}$. We apply the equalizer procedures $E(e_1, e_{2^\mu+1}, t_\mu), \dots, E(e_{2^\mu-1}, e_{2^\mu+2^\mu-1}, t_\mu)$; thus, we perform $2^\mu(2^\mu - 1) = n - 1$ additional steps of the WGA. We get

$$f^{\mu+1} := c_{\mu+1}(e_1 + \dots + e_{2^\mu-1} + e_{2^\mu+1} + \dots + e_{2^{\mu+1}-1}),$$

and we remove $c_{\mu+1} e_{2^{\mu+1}-1}$, to obtain $f_{2n}$.

Suppose that at the $\nu$th iteration ($\nu \ge \mu + 1$), we have arrived at

$$f_{M_\nu} := c_\nu \sum_{i \in \Lambda_\nu} e_i, \qquad c_\nu^2 = h^\nu 2^{-\nu}, \qquad \Lambda_\nu = \{i_1 < i_2 < \dots < i_{L_\nu}\} \subset [1, 2^\nu].$$

We begin performing the $(\nu + 1)$st iteration by applying the equalizer procedures $E(e_{i_1}, e_{2^\nu+1}, t_\mu), \dots, E(e_{i_{2^\mu-1}}, e_{2^\nu+2^\mu-1}, t_\mu)$. Thus, we have performed $2^\mu(2^\mu - 1) = n - 1$ steps of the WGA. Since $i_{2^\mu-1} < i_{L_\nu}$, we remove $c_\nu e_{i_{L_\nu}}$ by a PGA step as in the $n$th step. We now apply $E(e_{i_{2^\mu}}, e_{2^\nu+2^\mu}, t_\mu), \dots, E(e_{i_{2^{\mu+1}-2}}, e_{2^\nu+2^{\mu+1}-2}, t_\mu)$, and if $i_{2^{\mu+1}-2} < i_{L_\nu-1}$, we remove $c_\nu e_{i_{L_\nu-1}}$, and keep going until we can no longer continue. This means that either the $(n-1)$st equalizer is applied to the last remaining element in $\Lambda_\nu$, or that we are left with less than $n - 1$ elements. In the former case we have arrived at

$$f^{\nu+1} := c_{\nu+1} \sum_{i \in \Lambda} e_i, \qquad c_{\nu+1}^2 = h^{\nu+1} 2^{-\nu-1}, \qquad \Lambda \subset [1, 2^{\nu+1}]. \tag{2.19}$$

With $\lambda := \max \Lambda$, we then remove $c_{\nu+1} e_\lambda$ in the $n$th step, and denote $\Lambda_{\nu+1} := \Lambda \setminus \{\lambda\} \subset [1, 2^{\nu+1}]$. In the latter case we form equalizers for the remaining elements, and obtain (2.19). We now perform as many WGA steps of the form

$$f^{\nu+1} - \langle f^{\nu+1}, e_i\rangle e_i, \quad i \notin \Lambda,$$

as needed in order to have a total of $n - 1$ steps, and in the $n$th step we remove $c_{\nu+1} e_\lambda$. As a result in both cases, after $M_{\nu+1}$ steps, we have

$$f_{M_{\nu+1}} := c_{\nu+1} \sum_{i \in \Lambda_{\nu+1}} e_i, \qquad c_{\nu+1}^2 = h^{\nu+1} 2^{-\nu-1}, \qquad \Lambda_{\nu+1} \subset [1, 2^{\nu+1}], \qquad |\Lambda_{\nu+1}| =: L_{\nu+1}.$$

It is clear that we have removed at most $\lceil L_\nu/(2^\mu - 1)\rceil$ elements $e_i$. Therefore,

$$L_{\nu+1} \ge 2\big(L_\nu - L_\nu/(2^\mu - 1) - 1\big) = 2L_\nu\Big(\frac{2^\mu - 2}{2^\mu - 1} - \frac{1}{L_\nu}\Big) \ge 2L_\nu(1 - 2^{-\mu+1}), \tag{2.20}$$

and

$$\|f_{M_{\nu+1}}\|^2 = c_{\nu+1}^2 L_{\nu+1} \ge h^{\nu+1} 2^{-\nu} L_\nu (1 - 2^{-\mu+1}) = h(1 - 2^{-\mu+1}) \|f_{M_\nu}\|^2. \tag{2.21}$$

Also

$$M_{\nu+1} \ge M_\nu + \big(L_\nu - \lceil L_\nu/(2^\mu - 1)\rceil\big) 2^\mu + \lceil L_\nu/(2^\mu - 1)\rceil \ge M_\nu + L_\nu 2^\mu - \lceil L_\nu/(2^\mu - 1)\rceil (2^\mu - 1) \ge M_\nu + L_\nu(2^\mu - 2).$$

Taking into account that

$$M_\mu = 2^{2\mu} - 2^\mu + 1, \qquad \text{and} \qquad L_\mu = 2^\mu - 1,$$

we get by (2.20)

$$M_\nu \ge \big(2(1 - 2^{-\mu+1})\big)^{\nu-\mu} L_\mu (2^\mu - 2) \ge C(\mu) 2^{c\nu}, \quad \nu \ge \mu, \tag{2.22}$$

with absolute constant $c > 0$, since $\mu \ge 3$. After $M_\nu$ steps we have by (2.21)

$$\|f_{M_\nu}\|^2 \ge h^{\nu-\mu}(1 - 2^{-\mu+1})^{\nu-\mu} \|f_{M_\mu}\|^2 \ge (1 - 2^{-\mu+1})^{2\nu+1} \ge C(\mu) 2^{-c_1 \nu 2^{-\mu}} \ge C(\mu) M_\nu^{-c_2 2^{-\mu}},$$

where we have applied the fact that $\|f_{M_\mu}\|^2 = h^\mu(1 - 2^{-\mu})$, and for the last inequality we used (2.22). Observing that $n^{-1/2} \le \sqrt{2}\, 2^{-\mu}$, we conclude that the exponent of the power rate of decrease of $\|f_{M_\nu}\|^2$ is of order $n^{-1/2}$.

3.  Simultaneous  approximation  by  greedy  algorithm 

Given are a Hilbert space $H$ and a dictionary $\mathcal{D}$. For $N \ge 2$, let $H_N := H \times \dots \times H$, $N$ times, i.e., the general element in $H_N$ is $F := (f^1, \dots, f^N)$, $f^k \in H$. It is a Hilbert space with the inner product

$$\langle F_1, F_2\rangle := \sum_{k=1}^{N} \langle f_1^k, f_2^k\rangle.$$

Let $\mathcal{D}_N$ be the collection

$$\Big\{(\alpha_1 g_1, \dots, \alpha_N g_N) \ \Big|\ g_k \in \mathcal{D},\ \sum_{k=1}^{N} \alpha_k^2 = 1\Big\}.$$

Then it is easy to see that $\overline{\operatorname{span}}\,\mathcal{D}_N = H_N$. (Actually, $H_N$ is spanned even by linear combinations of elements of the form $(0, \dots, 0, g, 0, \dots, 0)$, where $g \in \mathcal{D}$ is arbitrary and is in arbitrary position.) Also, all elements in $\mathcal{D}_N$ are normalized.

We begin with $F_0 := (f_0^1, \dots, f_0^N)$ and a sequence $0 \le t_m \le 1$, and we want to construct weak greedy approximation from $\mathcal{D}$, simultaneously to all $N$ functions. For a given $F$ we are looking for an element $G \in \mathcal{D}_N$ of a special form

$$G := G(F, g) := (\rho_1 g, \rho_2 g, \dots, \rho_N g), \quad g \in \mathcal{D}, \qquad \rho_i := \langle f^i, g\rangle \Big(\sum_{j=1}^{N} |\langle f^j, g\rangle|^2\Big)^{-1/2}. \tag{3.1}$$

For $G$ of the form (3.1) the operation

$$F_1 := F - \langle F, G\rangle G$$

means the same operation performed coordinatewise

$$f_1^i := f^i - \langle f^i, g\rangle g, \quad i = 1, \dots, N.$$

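We note that

$$\|F\|_{\mathcal{D}_N} = \sup_{\substack{\alpha = (\alpha_1, \dots, \alpha_N),\ \|\alpha\|_2 = 1 \\ g_1, \dots, g_N \in \mathcal{D}}} \Big|\sum_{i=1}^{N} \langle f^i, g_i\rangle \alpha_i\Big| = \Big(\sum_{i=1}^{N} \|f^i\|_{\mathcal{D}}^2\Big)^{1/2}. \tag{3.2}$$

Lemma 3.1. For any $F \in H_N$ we have

$$\sup_{g \in \mathcal{D}} |\langle F, G(F, g)\rangle| \ge \max_i \|f^i\|_{\mathcal{D}} \ge N^{-1/2} \|F\|_{\mathcal{D}_N}.$$

Proof. On the one hand,

$$\sup_{g \in \mathcal{D}} |\langle F, G(F, g)\rangle| = \sup_{g \in \mathcal{D}} \Big(\sum_{i=1}^{N} |\langle f^i, g\rangle|^2\Big)^{1/2} \ge \max_i \sup_{g \in \mathcal{D}} |\langle f^i, g\rangle| = \max_i \|f^i\|_{\mathcal{D}}, \tag{3.3}$$

and on the other, by (3.2),

$$\|F\|_{\mathcal{D}_N} = \Big(\sum_{i=1}^{N} \|f^i\|_{\mathcal{D}}^2\Big)^{1/2} \le N^{1/2} \max_i \|f^i\|_{\mathcal{D}}. \tag{3.4}$$

Combining (3.3) and (3.4) completes the proof of Lemma 3.1. $\Box$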
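The two inequalities of Lemma 3.1 are easy to test numerically. The sketch below uses a randomly generated finite set of unit vectors in $\mathbb{R}^4$ in place of the dictionary and $N = 3$ random components. The sizes and the random seed are arbitrary assumptions, and the quantity $\|F\|_{\mathcal{D}_N}$ is computed through formula (3.2).

```python
import math
import random

def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

def unit(v):
    n = math.sqrt(inner(v, v))
    return [a / n for a in v]

random.seed(0)
dim, N = 4, 3
# hypothetical finite dictionary: 12 random unit vectors
D = [unit([random.gauss(0.0, 1.0) for _ in range(dim)]) for _ in range(12)]
F = [[random.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(N)]

# sup_g |<F, G(F, g)>| = sup_g (sum_i |<f^i, g>|^2)^{1/2}
lhs = max(math.sqrt(sum(inner(f, g) ** 2 for f in F)) for g in D)

# ||f^i||_D for each component, and their maximum
component_norms = [max(abs(inner(f, g)) for g in D) for f in F]
mid = max(component_norms)

# N^{-1/2} ||F||_{D_N}, with ||F||_{D_N} = (sum_i ||f^i||_D^2)^{1/2} by (3.2)
rhs = math.sqrt(sum(c * c for c in component_norms)) / math.sqrt(N)
```

For any choice of data the three numbers satisfy `lhs >= mid >= rhs`, which is the chain of inequalities used to pass from the WGA estimates to the simultaneous algorithms.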

Let a weakness sequence $\tau = \{t_k\}_{k=1}^{\infty}$ be given. The upper estimate for the VWGA, namely, for $\sum_{i=1}^{N} \|f_m^{i,v,\tau}\|^2$, can be obtained by Lemma 3.1 from the corresponding upper estimate for the WGA with the weakness sequence $\tau' := \{t_k N^{-1/2}\}_{k=1}^{\infty}$. Actually we do better: we formulate two theorems which are valid for the VWGA and for both SWGA1 and SWGA2. Thus let $s$ stand for either $v$ or $s1$ or $s2$.

Theorem 3.1. Let $\mathcal{D}$ be an arbitrary dictionary in $H$. Assume $\tau := \{t_k\}_{k=1}^{\infty}$ is a nonincreasing sequence. Then for any vector of elements $f^1, \dots, f^N$, $f^i \in A_1(\mathcal{D})$, $i = 1, \dots, N$, we have

$$\sum_{i=1}^{N} \|f_m^{i,s,\tau}\|^2 \le N^2 \Big(1 + \frac{1}{N}\sum_{k=1}^{m} t_k^2\Big)^{-\frac{t_m}{2N^{1/2}+t_m}}.$$

Corollary 3.1. Let $\mathcal{D}$ be an arbitrary dictionary in $H$. Assume $\tau := \{t_k\}_{k=1}^{\infty}$, $t_k = t$, $k \ge 1$, $0 < t \le 1$. Then for any vector of elements $f^1, \dots, f^N$, $f^i \in A_1(\mathcal{D})$, $i = 1, \dots, N$, we have

$$\sum_{i=1}^{N} \|f_m^{i,s,\tau}\|^2 \le N^2 (1 + mt^2/N)^{-\frac{t}{2N^{1/2}+t}}.$$

Note that for $s = v$, Corollary 3.1 coincides with Theorem 1.4.

Proof. The proof follows from Theorem 1.1 and Lemma 3.1, when we observe that $f^i \in A_1(\mathcal{D})$, $i = 1, \dots, N$, implies $(f^1, \dots, f^N) \in \mathcal{A}_1(\mathcal{D}_N, N)$. $\Box$


A similar proof yields

Theorem 3.2. Assume that for the weakness sequence $\tau = \{t_k\}_{k=1}^{\infty}$ there are a natural number $n$ and a real number $0 < t \le 1$ such that

$$n^{-1} \sum_{k=ln+1}^{(l+1)n} t_k^2 \ge t^2, \quad l = 0, 1, 2, \dots.$$

Then for any $0 < \delta < 1$ and all $f^i \in A_1(\mathcal{D})$, $i = 1, \dots, N$,

$$\sum_{i=1}^{N} \|f_{ln}^{i,s,\tau}\|^2 \le N^2 (3n/\delta^2)^{\frac{r}{2+r}} (1 + lt^2)^{-\frac{r}{2+r}}$$

with $r := t(1 - \delta)N^{-1/2}$.

We are in a position to discuss the convergence of the VWGA, SWGA1, and SWGA2. We denote by $\mathcal{V}$ the class of all sequences $x = \{x_k\}_{k=1}^{\infty}$, $x_k \ge 0$, $k = 1, 2, \dots$, for which there exists a sequence $0 = q_0 < q_1 < \dots$ such that

$$\sum_{s=1}^{\infty} \frac{2^s}{\Delta q_s} < \infty,$$

where $\Delta q_s := q_s - q_{s-1}$, and

$$\sum_{s=1}^{\infty} 2^{-s} \sum_{k=1}^{q_s} x_k^2 < \infty.$$

Remark 3.1. It is clear from this definition that if $x \in \mathcal{V}$ and for some $K \ge 1$ and $c$ we have $0 \le y_k \le c x_k$, $k \ge K$, then $y := \{y_k\}_{k=1}^{\infty} \in \mathcal{V}$.

The following theorem has been proved in [40].

Theorem 3.3. The condition $\tau \notin \mathcal{V}$ is necessary and sufficient for the convergence of the Weak Greedy Algorithm with a weakness sequence $\tau$, for each $f$ and all Hilbert spaces $H$ and dictionaries $\mathcal{D}$.

It is clear from Theorem 3.3 that the condition $\tau \notin \mathcal{V}$ is also necessary for convergence of the VWGA, SWGA1, and SWGA2 with the weakness sequence $\tau$. It has been proved in [30] that this condition ($\tau \notin \mathcal{V}$) is also sufficient for the convergence of the VWGA. We note that $\tau = \{t_k\} \notin \mathcal{V}$ implies $\tau' := \{t_k N^{-1/2}\} \notin \mathcal{V}$. Thus Theorem 3.3 combined with Lemma 3.1 implies the following generalization of Theorem 3.3.

Theorem 3.4. The condition $\tau \notin \mathcal{V}$ is necessary and sufficient for the convergence of each of the algorithms VWGA, SWGA1, SWGA2 with a weakness sequence $\tau$, for each vector of elements $f^1, \dots, f^N$, $N$ arbitrary, and all Hilbert spaces $H$ and dictionaries $\mathcal{D}$.

Theorems 3.1 and 3.2 give estimates for the $\ell_2^N$-norm of the residual vector $(\|f_m^1\|, \dots, \|f_m^N\|)$. We wish to introduce greedy type algorithms that yield estimates for the $\ell_\infty^N$-norm of the residual vector. We define the Alternating Weak Greedy Algorithm for $N$ elements (AWGA). Again, it differs from the VWGA only at the first step (out of three) of each iteration. Let $t \in (0, 1]$. At the $m$th iteration, $m = lN + i$, in the first step of the AWGA

1.(AWGA) We look for any $\varphi_m^{a,\tau} \in \mathcal{D}$ satisfying

$$|\langle f_{m-1}^{i,a,\tau}, \varphi_m^{a,\tau}\rangle| \ge \|f_{m-1}^{i,a,\tau}\|_{\mathcal{D}}.$$

It is clear that for each $i$ any realization of the AWGA for the $i$th component $f^i$ can be viewed as a realization of the WGA with the weakness sequence $\tau^i := \{t_k^i\}_{k=1}^{\infty}$,

$$t_k^i := \begin{cases} 1, & k = lN + i,\ l = 0, 1, 2, \dots, \\ 0 & \text{otherwise}. \end{cases}$$

Theorem 3.5. Given $f^i \in A_1(\mathcal{D})$, $i = 1, \dots, N$, the AWGA yields the estimates

$$\|f_{lN}^{i,a,\tau}\|^2 \le (3N^2/\delta^2)^{\frac{a}{2+a}} (1 + l)^{-\frac{a}{2+a}}, \quad 0 < \delta < 1, \quad 1 \le i \le N,$$

with $a = (1 - \delta)N^{-1/2}$.


References 

[1] Andrew R. Barron, Universal approximation bounds for superposition of n sigmoidal functions, IEEE Transactions on Information Theory 39 (1993), 930-945.

[2]  A.  Cohen,  R.A.  DeVore,  and  R.  Hochmuth,  Restricted  Nonlinear  Approximation,  Constructive  Approx. 
16  (2000),  85-113. 

[3]  G.  Davis,  S.  Mallat,  and  M.  Avellaneda,  Adaptive  greedy  approximations,  Constr.  Approx.  13  (1997), 
57-98. 

[4]  R.A.  DeVore,  Nonlinear  Approximation,  Acta  Numerica  (1998),  51-150. 

[5]  R.  DeVore,  B.  Jawerth,  and  V.  Popov,  Compression  of  wavelet  decompositions,  Amer.  J.  of  Math.  114 
(1992),  737-785. 

[6] R.A. DeVore and V.N. Temlyakov, Nonlinear approximation by trigonometric sums, J. Fourier Anal. and Appl. 2 (1995), 29-48.

[7]  R.A.  DeVore  and  V.N.  Temlyakov,  Some  remarks  on  Greedy  Algorithms,  Advances  in  comp.  Math.  5 
(1996),  173-187. 

[8] R.A. DeVore and V.N. Temlyakov, Nonlinear approximation in finite-dimensional spaces, J. Complexity 13 (1997), 489-508.

[9] S.J. Dilworth, N.J. Kalton, D. Kutzarova, V.N. Temlyakov, The Thresholding Greedy Algorithm, Greedy Bases, and Duality, IMI-Preprints series 23 (2001), 1-23.

[10]  D.L.  Donoho,  Unconditional  bases  are  optimal  bases  for  data  compression  and  for  statistical  estimation, 
Appl.  Comput.  Harmon.  Anal.  1  (1993),  100-115. 

[11] D.L. Donoho, CART and Best-Ortho-Basis: A Connection, Preprint (1995), 1-45.

[12]  M.  Donahue,  L.  Gurvits,  C.  Darken,  E.  Sontag,  Rate  of  convex  approximation  in  non-Hilbert  spaces, 
Constr.  Approx.  13  (1997),  187-220. 

[13]  V.V.  Dubinin,  Greedy  Algorithms  and  Applications,  Ph.D.  Thesis,  University  of  South  Carolina,  1997. 

[14]  J.H.  Friedman  and  W.  Stuetzle,  Projection  pursuit  regression,  J.  Amer.  Stat.  Assoc.  76  (1981),  817-823. 

[15]  R.  Gribonval  and  M.  Nielsen,  Some  remarks  on  non-linear  approximation  with  Schauder  bases,  East 
J.  Approx.  7  (2001),  267-285. 

[16]  P.J.  Huber,  Projection  Pursuit,  Annals  of  Stat.  13  (1985),  435-475. 

[17]  L.  Jones,  On  a  conjecture  of  Huber  concerning  the  convergence  of  projection  pursuit  regression,  Annals 
of  Stat.  15  (1987),  880-882. 


[18]  L.  Jones,  A  simple  lemma  on  greedy  approximation  in  Hilbert  space  and  convergence  rates  for  projection 
pursuit  regression  and  neural  network  training,  Annals  of  Stat.  20  (1992),  608-613.

[19]  A.  Kamont  and  V.N.  Temlyakov,  Greedy  approximation  and  the  multivariate  Haar  system,  IMI-Preprint 
series  20  (2002),  1-24. 

[20]  B.  S.  Kashin  and  V.  N.  Temlyakov,  On  best  m-terms  approximations  and  the  entropy  of  sets  in  the 
space  L1,  Math.  Notes  56  (1994),  57-86.

[21]  B.S.  Kashin  and  V.N.  Temlyakov,  On  estimating  approximative  characteristics  of  classes  of  functions 
with  bounded  mixed  derivative,  Math.  Notes  58  (1995),  922-925. 

[22]  G.  Kerkyacharian  and  D.  Picard,  Entropy,  universal  coding,  approximation  and  bases  properties,
University  of  Paris  6  and  7,  Preprint  663  (2001),  1-32.

[23]  S.V.  Konyagin  and  V.N.  Temlyakov,  A  remark  on  greedy  approximation  in  Banach  spaces,  East  J.  on 
Approx.  5  (1999),  1-15. 

[24]  S.V.  Konyagin  and  V.N.  Temlyakov,  Rate  of  convergence  of  Pure  Greedy  Algorithm,  East  J.  on  Approx. 
5  (1999),  493-499. 

[25]  S.V.  Konyagin  and  V.N.  Temlyakov,  Convergence  of  Greedy  Approximation  I.  General  Systems,
IMI-Preprint  series  08  (2002),  1-19.

[26]  S.V.  Konyagin  and  V.N.  Temlyakov,  Convergence  of  Greedy  Approximation  II.  The  Trigonometric 
system,  IMI-Preprint  series  09  (2002),  1-25. 

[27]  S.V.  Konyagin  and  V.N.  Temlyakov,  Greedy  Approximation  with  regard  to  bases  and  general  minimal 
systems,  Serdica  Math.  J.  28  (2002),  305-328. 

[28]  E.D.  Livshitz,  On  the  rate  of  convergence  of  greedy  algorithm,  Manuscript  (2000). 

[29]  E.D.  Livshitz  and  V.N.  Temlyakov,  On  convergence  of  Weak  Greedy  Algorithms,  IMI-Preprint  13
(2000),  1-9. 

[30]  A.  Lutoborski  and  V.N.  Temlyakov,  Vector  Greedy  Algorithms,  IMI-Preprint  10  (2002),  1-16. 

[31]  P.  Oswald,  Greedy  algorithms  and  best  m-term  approximation  with  respect  to  biorthogonal  systems, 
Preprint  (2000),  1-22. 

[32]  L.  Rejto  and  G.G.  Walter,  Remarks  on  projection  pursuit  regression  and  density  estimation,  Stochastic 
Analysis  and  Applications  10  (1992),  213-222.

[33]  E.  Schmidt,  Zur  Theorie  der  linearen  und  nichtlinearen  Integralgleichungen.  I,  Math.  Annalen  63 
(1906-1907),  433-476. 

[34]  V.N.  Temlyakov,  Greedy  algorithm  and  m-term  trigonometric  approximation,  Constr.  Approx.  14 
(1998),  569-587. 

[35]  V.N.  Temlyakov,  The  best  m-term  approximation  and  Greedy  Algorithms,  Advances  in  Comp.  Math. 
8  (1998),  249-265. 

[36]  V.N.  Temlyakov,  Nonlinear  m-term  approximation  with  regard  to  the  multivariate  Haar  system,  East 
J.  Approx.  4  (1998),  87-106. 

[37]  V.N.  Temlyakov,  Greedy  algorithms  with  regard  to  the  multivariate  systems  with  a  special  structure, 
Constr.  Approx.  16  (2000),  399-425. 

[38]  V.N.  Temlyakov,  Greedy  algorithms  and  m-term  approximation  with  regard  to  redundant  dictionaries, 
J.  Approx.  Theory  98  (1999),  117-145. 

[39]  V.N.  Temlyakov,  Weak  greedy  algorithms,  Advances  in  Comp.  Math.  12  (2000),  213-227. 

[40]  V.N.  Temlyakov,  A  criterion  for  convergence  of  Weak  Greedy  Algorithms,  Advances  in  Comp.  Math. 
17  (2002),  269-280. 

[41]  V.N.  Temlyakov,  Two  lower  estimates  in  greedy  approximation,  IMI-Preprint  series  07  (2001),  1-12. 

[42]  V.N.  Temlyakov,  Nonlinear  Methods  of  Approximation,  IMI-Preprint  series  09  (2001),  1-57. 

[43]  P.  Wojtaszczyk,  Greedy  algorithms  for  general  systems,  J.  Approx.  Theory  107  (2000),  293-314. 

School  of  Mathematical  Sciences,  Sackler  Faculty  of  Exact  Sciences,  Tel  Aviv  University,  Tel  Aviv  69978,  Israel

leviatan@math.tau.ac.il

Department  of  Mathematics,  University  of  South  Carolina,  Columbia,  SC  29208  USA 

temlyak@math.sc.edu


